zram: support asynchronous GC for lazy slot freeing #735
blktests-ci[bot] wants to merge 1 commit into
Swap freeing can be expensive when unmapping a VMA containing many swap entries. This has been reported to significantly delay memory reclamation during Android’s low-memory killing, especially when multiple processes are terminated to free memory, with slot_free() accounting for more than 80% of the total cost of freeing swap entries.

Two earlier attempts by Lei and Zhiguo added a new thread in the mm core to asynchronously collect and free swap entries [1][2], but the design itself is fairly complex.

When anon folios and swap entries are mixed within a process, reclaiming anon folios from killed processes helps return memory to the system as quickly as possible, so that newly launched applications can satisfy their memory demands. It is not ideal for swap freeing to block anon folio freeing. On the other hand, swap freeing can still return memory to the system, although at a slower rate due to memory compression.

Therefore, in zram, we introduce a GC worker to allow anon folio freeing and slot_free() to run in parallel, since slot_free() is performed asynchronously, maximizing the rate at which memory is returned to the system.

Xueyuan’s test on RK3588 shows that unmapping a 256MB swap-filled VMA becomes 3.4× faster when pinning tasks to CPU2, reducing the execution time from 63,102,982 ns to 18,570,726 ns.

A positive side effect is that async GC also slightly improves do_swap_page() performance, as it no longer has to wait for slot_free() to complete. Xueyuan’s test shows that swapping in 256MB of data (each page filled with repeating patterns such as “1024 one”, “1024 two”, “1024 three”, and “1024 four”) reduces execution time from 1,358,133,886 ns to 1,104,315,986 ns, achieving a 1.22× speedup.

[1] https://lore.kernel.org/all/20240805153639.1057-1-justinjiang@vivo.com/
[2] https://lore.kernel.org/all/20250909065349.574894-1-liulei.rjpt@vivo.com/

Tested-by: Xueyuan Chen <xueyuan.chen21@gmail.com>
Signed-off-by: Barry Song (Xiaomi) <baohua@kernel.org>
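The deferred-free idea in the commit message can be illustrated with a small user-space model. This is only a sketch, not the patch itself: the class `LazySlotStore`, its methods, and `unmap_range()` are hypothetical names invented for illustration, and a Python thread with a queue stands in for whatever worker mechanism the actual zram change uses. It shows the caller queueing slot indices and returning immediately, while a background worker performs the expensive free.

```python
import queue
import threading


class LazySlotStore:
    """Toy model of a store whose slot frees are deferred to a GC worker.

    All names here are hypothetical; this is not the zram implementation.
    """

    def __init__(self, nr_slots):
        # Pretend every slot holds a compressed page.
        self.slots = {i: b"x" * 64 for i in range(nr_slots)}
        self.pending = queue.Queue()  # indices marked free, not yet reclaimed
        self.freed = 0                # only ever updated by the worker thread
        self._worker = threading.Thread(target=self._gc_loop, daemon=True)
        self._worker.start()

    def slot_free_lazy(self, index):
        """Fast path: record the index and return; the real work is deferred."""
        self.pending.put(index)

    def _expensive_free(self, index):
        # Stand-in for the costly part of freeing a slot (an assumption).
        # A slot that is already gone is ignored, so a repeated free is harmless.
        if self.slots.pop(index, None) is not None:
            self.freed += 1

    def _gc_loop(self):
        while True:
            index = self.pending.get()
            try:
                if index is None:     # sentinel: stop the worker
                    return
                self._expensive_free(index)
            finally:
                self.pending.task_done()

    def flush(self):
        """Block until every deferred free queued so far has been processed."""
        self.pending.join()

    def stop(self):
        """Ask the worker to exit and wait for it."""
        self.pending.put(None)
        self._worker.join()


def unmap_range(store, indices):
    """Caller-side loop: returns as soon as all frees are queued."""
    for i in indices:
        store.slot_free_lazy(i)
    return len(indices)


if __name__ == "__main__":
    store = LazySlotStore(1024)
    queued = unmap_range(store, range(1024))
    store.flush()
    store.stop()
    print(queued, store.freed, len(store.slots))
```

In this model the caller's cost per entry is a single queue insertion, which mirrors the claim that anon folio freeing no longer has to wait for slot_free() to finish; the trade-off is that memory held by queued slots is returned only once the worker catches up.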
Upstream branch: aa54b1d
3a419eb to e04909a
Pull request for series with
subject: zram: support asynchronous GC for lazy slot freeing
version: 1
url: https://patchwork.kernel.org/project/linux-block/list/?series=1080269